Improve per-dormitory search for questions with no dormitory specified #39
Merged
Code Review
This pull request refactors the dormitory search logic by introducing a custom scoring function to prioritize official sources and dormitory matches, while also increasing the initial search depth. Feedback highlights that debug print statements should be removed or replaced with proper logging, and the '탕비실' (pantry) keyword, which was inadvertently removed from the search triggers, should be restored.
Comment on lines +426 to +427
print("should each dorm:", _should_search_each_dormitory(question))
print("question:", question)
Comment on lines +447 to +456
print("===== FINAL EACH DORMITORY CHUNKS =====")
print("chunks_count:", len(chunks))
for index, chunk in enumerate(chunks, start=1):
    print(
        index,
        chunk.get("document_id"),
        chunk.get("dormitory"),
        chunk.get("similarity"),
        chunk.get("source"),
        (chunk.get("content") or "")[:300],
        (chunk.get("content") or "")[:200],
    )
    print("================================")
print("======================================")
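One way to act on the debug-output feedback is to send the same fields through the standard `logging` module at DEBUG level, so they can be switched off in production. This is a minimal sketch, assuming each chunk is a dict with the keys printed above. The logger name `dormitory_search` and the helper `log_final_chunks` are hypothetical and do not come from the PR.

```python
import logging

# Hypothetical logger name; the project may already define its own logger.
logger = logging.getLogger("dormitory_search")


def log_final_chunks(chunks, preview_chars=200):
    """Log a one-line summary per chunk at DEBUG level instead of printing."""
    logger.debug("final per-dormitory chunks: count=%d", len(chunks))
    for index, chunk in enumerate(chunks, start=1):
        logger.debug(
            "%d document_id=%s dormitory=%s similarity=%s source=%s content=%r",
            index,
            chunk.get("document_id"),
            chunk.get("dormitory"),
            chunk.get("similarity"),
            chunk.get("source"),
            # Content may be missing or None, so fall back to an empty string.
            (chunk.get("content") or "")[:preview_chars],
        )
```

With this shape the output only appears when the logger is configured for DEBUG, for example via `logging.basicConfig(level=logging.DEBUG)` during local testing.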
Comment on lines +902 to +914
DORMITORY_SPECIFIC_SEARCH_TRIGGERS = [
    "휴게실",
    "다리미",
    "편의점",
    "전자레인지",
    "전자렌지",
    "정수기",
    "세탁실",
    "수용인원",
    "몇명",
    "몇명수용",
    "호실수",
]
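Type
Changes
When a per-dormitory facility or status question arrives with dormitory = null, the search now runs separately for Student Dormitories 1, 2, and 3 (제1·제2·제3학생생활관).
Previously, when the top_k results did not include a particular dormitory, that dormitory's information could be missing from the answer.
Background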
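With dormitory = null, a question that asks about all dormitories, such as 휴게실에 뭐 있어? ("What's in the lounge?"), could return results that omitted some dormitories or selected only the common tips document.
For example, even when Student Dormitories 1, 2, and 3 each have lounge information, if the top chunks of the global search covered only some of them, the answer was generated from those dormitories alone.
To fix this, per-dormitory facility and status questions no longer rely on the global search alone. Each dormitory is searched separately and the combined results are used to generate the answer.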
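The per-dormitory search and merge described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `search_fn` is a hypothetical callable standing in for the project's retrieval function, the `"official"` source value and the `official_bonus` weight are invented for the example, and `score_chunk` only approximates the "prioritize official sources" scoring mentioned in the review summary.

```python
from typing import Any, Callable, Dict, List, Optional

# Dormitory names as written in the PR description.
DORMITORIES = ["제1학생생활관", "제2학생생활관", "제3학생생활관"]

Chunk = Dict[str, Any]
# Hypothetical signature: (question, dormitory filter, top_k) -> chunks.
SearchFn = Callable[[str, Optional[str], int], List[Chunk]]


def score_chunk(chunk: Chunk, official_bonus: float = 0.05) -> float:
    """Similarity plus a small assumed bonus for official sources."""
    similarity = float(chunk.get("similarity") or 0.0)
    bonus = official_bonus if chunk.get("source") == "official" else 0.0
    return similarity + bonus


def search_each_dormitory(
    question: str, search_fn: SearchFn, per_dormitory_k: int = 2
) -> List[Chunk]:
    """Search every dormitory separately and merge the best chunks of each."""
    merged: List[Chunk] = []
    seen = set()
    for dormitory in DORMITORIES:
        results = search_fn(question, dormitory, per_dormitory_k)
        ranked = sorted(results, key=score_chunk, reverse=True)[:per_dormitory_k]
        for chunk in ranked:
            # Skip chunks already collected, e.g. a shared tips document.
            key = (chunk.get("document_id"), chunk.get("content"))
            if key in seen:
                continue
            seen.add(key)
            merged.append(chunk)
    return merged
```

Because each dormitory contributes its own top results, no dormitory can be crowded out of the merged list by higher-scoring chunks from another one, which is the failure mode of the single global search.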
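Tests
휴게실에 뭐 있어? ("What's in the lounge?") + dormitory = null
다리미 있어? ("Is there an iron?") + dormitory = null
기숙사 수용인원 몇명이야? ("How many people does the dormitory accommodate?") + dormitory = null
Expected effect
Per-dormitory search runs only for the questions that need it, without increasing top_k.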